{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div style=\"text-align: right\"><i>Peter Norvig<br>May 2025</i></div>\n",
    "\n",
    "# Seven-Game Series?\n",
    "\n",
    "This time of year the basketball playoffs are in full swing. I have a pet peeve: analysts who say *\"These are two evenly matched teams. I expect the series will go seven games.\"* Is that really true? If each game is a 50/50 tossup, how often will this result in a seven-game series? How does the home-court advantage come into play? What if one team is slightly better? This notebook examines these questions. \n",
    "\n",
    "## Vocabulary of Concepts\n",
    "\n",
    "Here are the concepts I want to model, and the way I will represent them:\n",
    "\n",
    "- **Series**: Two teams play games until one wins four games. A game cannot be a tie. So a series can take anywhere from 4 to 7 games.\n",
    "- **Seeded team**: The team that is seeded higher (due to regular season wins) gets to host 4 of the potential 7 games (games 1, 2, 5, and 7).\n",
    "- **Series outcome**: A pair of integers, for example `(4, 3)` means that the seeded team has 4 wins and the other team has 3.\n",
    "- **Outcome distribution**: The class `Dist` will represent a probability distribution over possible series outcomes. \n",
    "- **Game win probability**: We assume there is a given fixed probability, *p*, that the seeded team will win a single game on a neutral court.\n",
    "- **Home-court advantage**: However, there is a home-court advantage of probability *h*. So the seeded team has probability *p* + *h* of winning each of the games 1, 2, 5, and 7, and probability *p* - *h* of winning each of the other games.\n",
    "\n",
    "\n",
    "Below is the class `Dist`. For example, `Dist({(4, 3): 0.5, (3, 4): 0.5})` represents a series that goes seven games, and each team has an equal chance to win. *Note*: `Dist` inherits from `Counter`, so you can add two dists together, and I define a `__mul__` method so you can multiply a dist by a probability."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "\n",
    "class Dist(Counter):\n",
    "    \"\"\"A Probability Distribution of {outcome: probability} results.\"\"\"\n",
    "    def __mul__(self, weight: float) -> 'Dist':\n",
    "        \"\"\"You can multiply a Dist by a scalar weight.\"\"\"\n",
    "        return Dist({outcome: p * weight for outcome, p in self.items()})"
   ]
  },
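  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of how these two operations combine, for illustration only. The class is repeated so that the cell runs on its own, and the names `seeded_wins`, `other_wins`, and `mixed` exist only in this example. Adding two dists gives back a plain `Counter` (at least in CPython), so the sum is wrapped in `Dist(...)`, just as `series_results` does below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "\n",
    "class Dist(Counter):\n",
    "    \"\"\"Same class as above, repeated so this cell runs on its own.\"\"\"\n",
    "    def __mul__(self, weight: float) -> 'Dist':\n",
    "        return Dist({outcome: p * weight for outcome, p in self.items()})\n",
    "\n",
    "# Two certain outcomes (example values only): a 4-3 win and a 3-4 loss for the seeded team.\n",
    "seeded_wins = Dist({(4, 3): 1.0})\n",
    "other_wins  = Dist({(3, 4): 1.0})\n",
    "\n",
    "# Weight each by 1/2 and add; wrapping the sum in Dist turns the Counter back into a Dist.\n",
    "mixed = Dist(seeded_wins * 0.5 + other_wins * 0.5)\n",
    "mixed"
   ]
  },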
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function `series_results` returns a distribution of possible series outcomes. The parameters are:\n",
    "- *p*, the seeded team's single-game win probability,\n",
    "- *h*, the home-court advantage probability,\n",
    "- *home*, a 7-character string saying which games are home or away for the seeded team\n",
    "- *W* and *L*, the number of wins and losses for the seeded team so far in the series (default 0 of each)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def series_results(p=0.50, h=0.10, home='HHAAHAH', W=0, L=0) -> Dist:\n",
    "    \"\"\"Return {(win, loss): probability, ...} for all possible outcomes of the series, given\n",
    "    the single-game win probability for the seeded team, `p`, and the home-court advantage, `h`.\"\"\"\n",
    "    def results(W: int, L: int) -> Dist:\n",
    "        if W == 4 or L == 4:\n",
    "            return Dist({(W, L): 1.0})\n",
    "        else:\n",
    "            p1 = p + (h if home[W + L] == 'H' else -h) # Probability of winning this one game\n",
    "            return Dist(results(W + 1, L) * p1  +\n",
    "                        results(W, L + 1) * (1 - p1))\n",
    "    return results(W, L)"
   ]
  },
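  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to gain confidence in the recursion is to compare it with a brute-force enumeration. The sketch below is an independent cross-check, not the method used in the rest of this notebook, and `brute_force_results` is a name made up for the example. It pretends that all 7 games are always played, computes the probability of each of the `2**7` win/loss sequences, and credits that probability to the score at the moment one team reached 4 wins. Under those assumptions it should agree with `series_results` up to floating-point rounding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import defaultdict\n",
    "from itertools import product\n",
    "\n",
    "def brute_force_results(p=0.50, h=0.10, home='HHAAHAH') -> dict:\n",
    "    \"\"\"Series outcome probabilities, found by enumerating all 2**7 game sequences.\"\"\"\n",
    "    dist = defaultdict(float)\n",
    "    for games in product([True, False], repeat=7):  # True: seeded team wins that game\n",
    "        prob, W, L = 1.0, 0, 0\n",
    "        for g, won in enumerate(games):\n",
    "            p1 = p + (h if home[g] == 'H' else -h)  # Seeded team's chance in game g\n",
    "            prob *= (p1 if won else 1 - p1)         # Probability of the full 7-game sequence\n",
    "            if W < 4 and L < 4:                     # Score stops changing once the series is decided\n",
    "                W, L = W + won, L + (not won)\n",
    "        dist[W, L] += prob\n",
    "    return dict(dist)\n",
    "\n",
    "brute_force_results(p=0.50, h=0)"
   ]
  },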
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at the results for a truly even series, where each game is 50/50, and there is no home-court advantage:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Dist({(4, 2): 0.15625,\n",
       "      (4, 3): 0.15625,\n",
       "      (3, 4): 0.15625,\n",
       "      (2, 4): 0.15625,\n",
       "      (4, 1): 0.125,\n",
       "      (1, 4): 0.125,\n",
       "      (4, 0): 0.0625,\n",
       "      (0, 4): 0.0625})"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "series_results(p=0.50, h=0)"
   ]
  },
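  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When there is no home-court advantage, the series length also has a closed form that can serve as a cross-check on the numbers above. This is a sketch, and `length_probs` is a name introduced just for this example. A series lasts exactly `n` games when the eventual winner takes game `n` and exactly 3 of the first `n - 1` games, so with `q = 1 - p` the probability is `comb(n - 1, 3) * (p**4 * q**(n - 4) + q**4 * p**(n - 4))`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from math import comb\n",
    "\n",
    "def length_probs(p=0.50) -> dict:\n",
    "    \"\"\"Return {n: probability that the series lasts n games}, assuming no home-court advantage.\"\"\"\n",
    "    q = 1 - p\n",
    "    return {n: comb(n - 1, 3) * (p**4 * q**(n - 4) + q**4 * p**(n - 4))\n",
    "            for n in range(4, 8)}\n",
    "\n",
    "length_probs(0.50)  # {4: 0.125, 5: 0.25, 6: 0.3125, 7: 0.3125}"
   ]
  },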
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The four most common outcomes are equally likely: either 6 or 7 games, with either team winning. It is easy to see why 6 and 7 games are equally likely: after 5 games, if the series isn't over, it must be 3-2 or 2-3. Half the time the \"3\" team will win game 6 (resulting in a 6-game series) and half the time they will lose (resulting in a 7-game series).\n",
    "\n",
    "Now I would like to make a **table** of results, for various winning percentages *p*. For each *p* I want to see \n",
    "- The probability for each possible series outcome (4-3, 4-2, etc.).\n",
    "- The probability for each possible series length (4, 5, 6, or 7 games).\n",
    "- The probability that the seeded team wins the series.\n",
    "\n",
    "*Note*: I will consider cases where the higher-seeded team has a game win probability less than 50% (as sometimes happens when a player is injured)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "from numpy import arange\n",
    "from typing import Sequence\n",
    "\n",
    "def series_results_table(h=0.0, pcts=arange(0.48, 0.70, 0.02)):\n",
    "    \"\"\"What happens in the series for various values of `p` and a given value of `h`?\"\"\"\n",
    "    outcomes = [(4, 3), (4, 2), (4, 1), (4, 0), (0, 4), (1, 4), (2, 4), (3, 4)]\n",
    "    bar = '----+' + '-' * 5 * len(pcts)\n",
    "    print(f'    | Game Win Percentage (± home-court advantage = {h:.0%})')\n",
    "    print(row('', pcts))\n",
    "    print(bar)\n",
    "    for (W, L) in outcomes:\n",
    "        results = [series_results(p, h)[W, L] for p in pcts]\n",
    "        print(row(f'{W}-{L}', results))\n",
    "    print(bar)\n",
    "    for N in [7, 6, 5, 4]:\n",
    "        results = [series_results(p, h)[4, N-4] + series_results(p, h)[N-4, 4] for p in pcts]\n",
    "        print(row(N, results))\n",
    "    print(bar)\n",
    "    print(row('Win', [sum(series_results(p, h)[W, L] for W, L in outcomes if W == 4) for p in pcts]))\n",
    "\n",
    "def row(name, pcts: Sequence[float]) -> str:\n",
    "    \"\"\"Create a string representing a row in the table.\"\"\"\n",
    "    return f'{name:^3} | ' + ' '.join(f'{p*100:4.1f}' for p in pcts)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is the table when there is no home-court advantage:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "    | Game Win Percentage (± home-court advantage = 0%)\n",
      "    | 48.0 50.0 52.0 54.0 56.0 58.0 60.0 62.0 64.0 66.0 68.0\n",
      "----+-------------------------------------------------------\n",
      "4-3 | 14.9 15.6 16.2 16.6 16.8 16.8 16.6 16.2 15.7 14.9 14.0\n",
      "4-2 | 14.4 15.6 16.8 18.0 19.0 20.0 20.7 21.3 21.7 21.9 21.9\n",
      "4-1 | 11.0 12.5 14.0 15.6 17.3 19.0 20.7 22.5 24.2 25.8 27.4\n",
      "4-0 |  5.3  6.2  7.3  8.5  9.8 11.3 13.0 14.8 16.8 19.0 21.4\n",
      "0-4 |  7.3  6.2  5.3  4.5  3.7  3.1  2.6  2.1  1.7  1.3  1.0\n",
      "1-4 | 14.0 12.5 11.0  9.7  8.4  7.2  6.1  5.2  4.3  3.5  2.9\n",
      "2-4 | 16.8 15.6 14.4 13.1 11.8 10.5  9.2  8.0  6.9  5.8  4.8\n",
      "3-4 | 16.2 15.6 14.9 14.1 13.2 12.1 11.1  9.9  8.8  7.7  6.6\n",
      "----+-------------------------------------------------------\n",
      " 7  | 31.1 31.2 31.1 30.7 29.9 28.9 27.6 26.2 24.5 22.6 20.6\n",
      " 6  | 31.2 31.2 31.2 31.0 30.8 30.4 30.0 29.4 28.6 27.8 26.7\n",
      " 5  | 25.1 25.0 25.1 25.3 25.7 26.2 26.9 27.6 28.5 29.3 30.2\n",
      " 4  | 12.6 12.5 12.6 13.0 13.6 14.4 15.5 16.9 18.5 20.3 22.4\n",
      "----+-------------------------------------------------------\n",
      "Win | 45.6 50.0 54.4 58.7 62.9 67.1 71.0 74.8 78.3 81.6 84.7\n"
     ]
    }
   ],
   "source": [
    "series_results_table(h=0.0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that a 6-game series is the most likely length, except when *p* is exactly 50% (in which case 6- and 7-game series are equally likely), or when *p* is 66% or more (in which case a 5-game series is most likely).\n",
    "\n",
    "In recent years the home-court advantage has been about 5%:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "    | Game Win Percentage (± home-court advantage = 5%)\n",
      "    | 48.0 50.0 52.0 54.0 56.0 58.0 60.0 62.0 64.0 66.0 68.0\n",
      "----+-------------------------------------------------------\n",
      "4-3 | 16.6 17.3 17.8 18.2 18.4 18.3 18.1 17.6 17.0 16.1 15.1\n",
      "4-2 | 13.2 14.4 15.5 16.6 17.6 18.4 19.1 19.6 20.0 20.1 20.0\n",
      "4-1 | 12.2 13.7 15.4 17.1 18.9 20.7 22.5 24.4 26.2 27.9 29.6\n",
      "4-0 |  5.2  6.1  7.2  8.4  9.7 11.1 12.8 14.6 16.6 18.8 21.2\n",
      "0-4 |  7.2  6.1  5.2  4.4  3.7  3.0  2.5  2.0  1.6  1.3  1.0\n",
      "1-4 | 12.7 11.2  9.9  8.6  7.4  6.3  5.3  4.5  3.7  3.0  2.4\n",
      "2-4 | 18.2 16.9 15.5 14.1 12.7 11.3  9.9  8.6  7.4  6.3  5.2\n",
      "3-4 | 14.7 14.1 13.5 12.6 11.7 10.8  9.7  8.7  7.6  6.6  5.6\n",
      "----+-------------------------------------------------------\n",
      " 7  | 31.3 31.4 31.3 30.8 30.1 29.1 27.8 26.3 24.6 22.7 20.7\n",
      " 6  | 31.5 31.3 31.1 30.7 30.3 29.7 29.1 28.3 27.4 26.4 25.3\n",
      " 5  | 24.9 25.0 25.3 25.7 26.3 27.0 27.9 28.8 29.8 30.9 31.9\n",
      " 4  | 12.4 12.3 12.4 12.7 13.3 14.2 15.3 16.6 18.2 20.0 22.1\n",
      "----+-------------------------------------------------------\n",
      "Win | 47.2 51.6 56.0 60.3 64.5 68.6 72.5 76.2 79.7 82.9 85.8\n"
     ]
    }
   ],
   "source": [
    "series_results_table(h=0.05)"
   ]
  },
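  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To isolate what home court is worth, one can compare the seeded team's series-win probability with and without the advantage. The sketch below is self-contained and uses a hypothetical helper, `series_win_prob`, which applies the same recursion as `series_results` but tracks only the chance that the seeded team reaches 4 wins first."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import lru_cache\n",
    "\n",
    "def series_win_prob(p=0.50, h=0.05, home='HHAAHAH') -> float:\n",
    "    \"\"\"Probability that the seeded team wins the series.\"\"\"\n",
    "    @lru_cache(maxsize=None)\n",
    "    def win(W: int, L: int) -> float:\n",
    "        if W == 4: return 1.0\n",
    "        if L == 4: return 0.0\n",
    "        p1 = p + (h if home[W + L] == 'H' else -h)  # Probability of winning this one game\n",
    "        return p1 * win(W + 1, L) + (1 - p1) * win(W, L + 1)\n",
    "    return win(0, 0)\n",
    "\n",
    "# Gain in series-win probability from a 5% home-court advantage, for a few values of p\n",
    "for p in (0.50, 0.60, 0.68):\n",
    "    gain = series_win_prob(p, 0.05) - series_win_prob(p, 0.0)\n",
    "    print(f'p = {p:.0%}: {gain:+.1%}')"
   ]
  },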
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
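    "Now, when the seeded team has a game win probability in the range 50% to 54%, a 7-game series is slightly more likely than a 6-game series: game 6 is played on the other team's court, so the seeded team's chance of winning it, *p* - *h*, is below 50% and the other team is favored. Overall, a home-court advantage of 5% raises the seeded team's chance of winning the series by about 1.6 percentage points when the teams are closely matched, and by a bit less (about 1.1 points at *p* = 68%) when the seeded team is much stronger.\n",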
    "\n",
    "Some people think a home-court advantage of as much as 13% is reasonable:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "    | Game Win Percentage (± home court advantage = 13%)\n",
      "    | 48.0 50.0 52.0 54.0 56.0 58.0 60.0 62.0 64.0 66.0\n",
      "----+--------------------------------------------------\n",
      "4-3 | 19.8 20.5 21.1 21.4 21.5 21.4 21.0 20.3 19.4 18.3\n",
      "4-2 | 11.5 12.6 13.6 14.6 15.5 16.3 16.9 17.4 17.6 17.7\n",
      "4-1 | 13.9 15.7 17.6 19.5 21.6 23.6 25.7 27.8 29.9 31.9\n",
      "4-0 |  4.6  5.4  6.4  7.5  8.8 10.2 11.8 13.5 15.4 17.5\n",
      "0-4 |  6.4  5.4  4.6  3.8  3.1  2.5  2.0  1.6  1.3  1.0\n",
      "1-4 | 10.5  9.2  8.0  6.8  5.8  4.8  4.0  3.2  2.6  2.0\n",
      "2-4 | 20.7 19.1 17.4 15.7 14.1 12.4 10.9  9.4  8.0  6.7\n",
      "3-4 | 12.7 12.1 11.4 10.5  9.7  8.7  7.7  6.8  5.8  4.9\n",
      "----+--------------------------------------------------\n",
      "  7 | 32.5 32.6 32.5 32.0 31.2 30.1 28.7 27.1 25.2 23.2\n",
      "  6 | 32.1 31.6 31.0 30.3 29.6 28.7 27.8 26.7 25.6 24.4\n",
      "  5 | 24.4 24.9 25.5 26.3 27.3 28.5 29.7 31.1 32.5 33.9\n",
      "  4 | 11.0 10.9 11.0 11.3 11.9 12.8 13.8 15.1 16.7 18.5\n",
      "----+--------------------------------------------------\n",
      "Win | 49.7 54.2 58.7 63.1 67.4 71.5 75.4 79.0 82.4 85.5\n"
     ]
    }
   ],
   "source": [
    "series_results_table(h=0.13)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This means that a 50/50 team with a 13% home-court advantage wins a series about as often as a 52% team with no home-court advantage."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
